Abstract: Flash memory is a non-volatile computer memory comprised of
blocks of cells, wherein each cell can take on $q$ different values or
levels. While increasing the cell level is easy, reducing the level of
a cell can be accomplished only by erasing an entire block. Since block
erasures are highly undesirable, coding schemes, known as floating
codes or flash codes, have been designed in order to maximize the
number of times that information stored in a flash memory can be
written (and re-written) prior to incurring a block erasure. An
$(n,k,t)_q$ flash code $C$ is a coding scheme for storing $k$
information bits in $n$ cells in such a way that any sequence of up to
$t$ writes (where a write is a transition $0 \to 1$ or $1 \to 0$ in any
one of the $k$ bits) can be accommodated without a block erasure. The
total number of available level transitions in $n$ cells is $n(q-1)$,
and the write deficiency of $C$, defined as $\delta(C) = n(q-1) - t$,
is a measure of how close the code comes to perfectly utilizing all
these transitions. For $k > 6$ and large $n$, the best previously known
construction of flash codes achieves a write deficiency of $O(qk^2)$.
On the other hand, the best known lower bound on write deficiency is
$\Omega(qk)$. In this paper, we present a new construction of flash
codes that approaches this lower bound to within a factor logarithmic
in $k$. To this end, we first improve upon the so-called "indexed"
flash codes, due to Jiang and Bruck, by eliminating the need for index
cells in the Jiang-Bruck construction. Next, we further increase the
number of writes by introducing a new multi-stage (recursive) indexing
scheme. We then show that the write deficiency of the resulting flash
codes is $O(qk \log k)$ if $q \ge \log_2 k$, and at most
$O(k \log^2 k)$ otherwise.

I. Introduction 

Flash memories are, by far, the most important type of nonvolatile
computer memory in use today. Flash devices are employed widely in
mobile, embedded, and mass-storage applications, and the growth in
this sector continues at a staggering pace.

A flash memory consists of an array of floating-gate cells, 
organized into blocks (a typical block contains 2^^ to 2^^ cells). 
The level or "state" of a cell is a function of the amount of
charge (electrons) trapped within it. In multilevel flash cells,
voltage is quantized to $q$ discrete threshold values; consequently,
the level of each cell can be modeled as an integer in the range
$0, 1, \ldots, q-1$. The parameter $q$ itself ranges from $q = 2$
(the conventional two-state case) up to $q = 256$. The most
conspicuous property of flash-storage technology is its inherent
asymmetry between cell programming (charge placement) and cell
erasing (charge removal). While adding charge to a single cell is a
fast and simple operation, removing charge from a cell is very
difficult. In fact, flash technology does not allow a single cell to
be erased; rather, only entire blocks can be
erased. Such block erasures are not only time-consuming, but
also degrade the physical quality of the memory. For example, 
a typical block in a multilevel flash memory can tolerate only 
about 10^ erasures before it becomes unusable. Therefore, it 
is of importance to design coding schemes that maximize the 
number of times information stored in a flash memory can be 
written (and re-written) prior to incurring a block erasure. 

Such coding schemes, known as floating codes or flash codes, were
first introduced in [3] two years ago. Since then, a few more papers
on this subject have appeared in the literature [2,4,6,8]. It should
be pointed out, however, that flash codes may be regarded as a
generalization of codes for write-once memories [1,7], which have
been studied since the early 1980s.

An $(n,k,t)_q$ flash code $C$ is a coding scheme for storing $k$
information bits in $n$ flash-memory cells, with $q$ levels each, in
such a way that any sequence of up to $t$ writes can be accommodated
without incurring a block erasure. In the literature on flash codes,
a write is always a bit-write, that is, a change $0 \to 1$ or
$1 \to 0$ in the value of one of the $k$ information bits. Observe
that in order to accommodate such a write, at least one of the $n$
cells must transition from a lower level to a higher level (since a
cell's level, determined by its charge, can only increase). On the
other hand, the total number of available level transitions in $n$
flash cells is $n(q-1)$. Thus, throughout this paper, we characterize
the performance of a flash code $C$ in terms of its write deficiency,
defined as $\delta(C) = n(q-1) - t$. According to the foregoing
discussion, $\delta(C)$ is a measure of how close $C$ comes to
perfectly utilizing all the available cell-level transitions: exactly
one per write. The primary goal in designing flash codes can thus be
expressed as minimizing deficiency.

What is the smallest possible write deficiency $\delta_q(n,k)$ for an
$(n,k,t)_q$ flash code, and how does it behave asymptotically as the
code parameters $k$ and $n$ get large? The best-known lower bound,
due to Jiang, Bohossian, and Bruck [3], asserts that

    $\delta_q(n,k) \;\ge\; \tfrac{1}{2}(q-1)\min\{n,\,k-1\}$    (1)

How closely can this bound be approached by code constructions? It
appears that the answer to this question depends on the relationship
between $k$ and $n$. In this paper, we are concerned mainly with the
case where both $k$ and $n$ are large, and $n$ is much larger than
$k$ (in particular, $n \ge k^2$). In Section V we briefly consider the
case $k/n = \mathrm{const}$. At the other end of the spectrum, the
case $k > n$ has been recently studied in [5].

The first construction of flash codes for large $k$ was reported by
Jiang and Bruck [4]. In this construction, the $k$ information bits
are partitioned into $m_1 = k/k'$ subsets of $k'$ bits each (with
$k' \le 6$) while the memory cells are subdivided into $m_2 \ge m_1$
groups of $n'$ cells each. Additional memory cells (called index
cells) are set aside to indicate for each subset of $k'$ bits which
group of $n'$ memory cells is used to store them. The deficiency of
the resulting codes is at least $O(\surd qn)$. Note that for
$n \ge k$, the lower bound on write deficiency in (1) behaves as
$\Omega(qk)$, and thus does not depend on $n$. Consequently, the gap
between the Jiang-Bruck construction [4] and the lower bound could be
arbitrarily large, especially when $n$ is much larger than $k$.

Recently, we have proposed in [8] a completely different construction
of flash codes. These codes are based upon representing the $n$
memory cells as a high-dimensional array, and achieve a write
deficiency of $O(qk^2)$. Crucially, the deficiency of these codes
does not depend on $n$. Nevertheless, there is still a significant
gap between $O(qk^2)$, which is the best currently known result, and
the lower bound of $\Omega(qk)$.



In this paper, we present a new construction of flash codes which
reduces the gap between the upper and lower bounds on write
deficiency to a factor that is logarithmic in the number of
information bits $k$. This result is arrived at in several stages. As
a starting point, we use the "indexed" flash codes of Jiang and Bruck
[4]. In Section III we develop new encoding and decoding procedures
for such codes that eliminate the need for index cells in the
Jiang-Bruck construction [4]. The write deficiency achieved thereby
is $O(qk^2)$, which coincides with the main result of [8]. When the
encoding procedure developed in Section III reaches its limit, there
are still potentially numerous unused cell-level transitions. In
Section IV we show how to take advantage of these transitions in
order to accommodate even more writes. To this end, we introduce a
new indexing scheme, which is invoked only after the encoding method
of Section III is exhausted. Thereupon, we extend this idea
recursively, through $\lceil \log_2 k \rceil$ different indexing
stages. This leads to the main result of this paper, established in
Theorem 3, namely

    $\Omega(qk) \;\le\; \delta_q(n,k) \;\le\; O\bigl(\max\{q, \log_2 k\}\, k \log k\bigr)$    (2)

for all $n \ge k^2$, where the upper bound is achieved constructively
by the flash codes described in Section IV.

Finally, in Section V we present and briefly discuss constructions of
flash codes for the case where the number of memory cells $n$ is not
significantly larger than the number of bits $k$.

II. Preliminaries 

Let us now give a precise definition of flash codes that were
introduced less formally in the previous section. We use $\{0,1\}^k$
to denote the set of binary vectors of length $k$, and refer to the
elements of this set as information vectors. The set of possible
levels for each cell is denoted by $A_q = \{0, 1, \ldots, q-1\}$ and
thought of as a subset of the integers. The $q^n$ vectors of length
$n$ over $A_q$ are called cell-state vectors. With this notation, any
flash code $C$ can be specified in terms of two functions: an
encoding map $\mathcal{E}$ and a decoding map $\mathcal{D}$. The
decoding map $\mathcal{D}: A_q^n \to \{0,1\}^k$ indicates for each
cell-state vector $x \in A_q^n$ the corresponding information vector.
In turn, the encoding map
$\mathcal{E}: \{0, 1, \ldots, k-1\} \times A_q^n \to A_q^n \cup \{\mathbf{E}\}$
assigns to every index $i$ and cell-state vector $x \in A_q^n$
another cell-state vector $y = \mathcal{E}(i,x)$ such that
$y_j \ge x_j$ for all $j$ and $\mathcal{D}(y)$ differs from
$\mathcal{D}(x)$ only in the $i$-th position. If no such
$y \in A_q^n$ exists, then $\mathcal{E}(i,x) = \mathbf{E}$,
indicating that block erasure is required. To bootstrap the encoding
process, we assume that the initial state of the $n$ memory cells is
$(0, 0, \ldots, 0)$. Henceforth, iteratively applying the encoding
map, we can determine how any sequence of transitions $0 \to 1$ or
$1 \to 0$ in the $k$ information bits maps into a sequence of
cell-state vectors, eventually terminated by the block erasure. This
leads to the following definition.

Definition. An $(n,k)_q$ flash code $C(\mathcal{D},\mathcal{E})$
guarantees $t$ writes if for all sequences of up to $t$ transitions
$0 \to 1$ or $1 \to 0$ in the $k$ information bits, the encoding map
$\mathcal{E}$ does not produce the block erasure symbol $\mathbf{E}$.
If so, we say that $C$ is an $(n,k,t)_q$ code, and define the
deficiency of $C$ as $\delta(C) = n(q-1) - t$.

In addition to this definition, we will also use the following
terminology. Given a vector $x = (x_1, x_2, \ldots, x_m)$ over $A_q$,
we define its weight as $\mathrm{wt}(x) = x_1 + x_2 + \cdots + x_m$
(where the addition is over the integers), and its parity as
$\mathrm{wt}(x) \bmod 2$.
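To make the roles of the two maps concrete, here is a minimal Python sketch of a toy flash code for the special case $k = 1$, which is not taken from the paper: the single bit is stored as the parity of the total weight, so every write raises exactly one cell level. The names `encode`, `decode`, `guaranteed_writes` and the string `ERASE` are illustrative assumptions.

```python
ERASE = "E"  # hypothetical stand-in for the block erasure symbol


def decode(x):
    """Decoding map for k = 1: the bit is the parity of the weight of x."""
    return [sum(x) % 2]


def encode(i, x, q):
    """Encoding map for k = 1: flip bit i = 0 by raising one cell level.

    Returns a new cell-state vector y with y_j >= x_j for all j,
    or ERASE when every cell is already at level q - 1.
    """
    assert i == 0, "this toy code stores a single bit"
    for j, level in enumerate(x):
        if level < q - 1:
            return x[:j] + [level + 1] + x[j + 1:]
    return ERASE


def guaranteed_writes(n, q):
    """Count the writes accommodated, starting from the all-zero state."""
    x, bit, t = [0] * n, 0, 0
    while True:
        y = encode(0, x, q)
        if y is ERASE:
            return t
        bit ^= 1
        assert decode(y) == [bit]          # decoder tracks the bit
        x, t = y, t + 1
```

With $n = 5$ and $q = 4$ this toy code accommodates $n(q-1) = 15$ writes, so its write deficiency is zero, in agreement with the definition above.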



III. Index-less Indexed Flash Codes 

Our point of departure is the class of so-called indexed flash codes,
due to Jiang and Bruck [4], that were briefly described in Section I.
In this section, we eliminate the need for index cells (and, thus,
the overhead associated with these cells) in the Jiang-Bruck
construction [4]. This is achieved by "encoding" the indices into the
order in which the cell levels are increased.

As in [4], we partition the $n$ memory cells into $m$ groups of $n'$
cells each. However, while in [4] the value of $n'$ is more or less
arbitrary, in our construction $n' = k$. We henceforth refer to such
groups of $n' = k$ cells as blocks (though they are not related to
the physical blocks of floating-gate cells which comprise the flash
memory). We will furthermore use, throughout this paper, the
following terminology. We say that:

► a block is full if all its cells are at level $q-1$;

► a block is empty if all its cells are at level zero;

► a block is active if it is neither full nor empty;

► a block is live if it is not full (either active or empty).

In our construction, each block represents exactly one bit. This
implies that the total number of blocks, given by
$m = \lfloor n/k \rfloor$, must be at least $k$, which in turn
implies $n \ge k^2$. If $n$ is not divisible by $k$, the remaining
cells are simply left unused. Finally, we also assume that either $k$
is even or $q$ is odd. If this is not the case, we can invoke the
same construction with $k$ replaced by $k + 1$ (and the last bit
permanently set to zero).

The key idea is that each block is used to encode not only the
current value of the bit that it represents, but also which of the
$k$ bits it represents. The value of the bit is simply the parity of
the block. The index of the bit is encoded in the order in which the
levels of the $k$ cells are increased. For example, if the block
stores the $i$-th bit, first the level of the $i$-th cell in the
block is increased from $0$ to $q-1$ in response to the transitions
$0 \to 1$ and $1 \to 0$ in the bit value. Then, the same procedure is
applied to the $(i+1)$-st cell, the $(i+2)$-nd cell, and so on, with
the indices $i+1, i+2, \ldots$ interpreted cyclically (modulo $k$).
This process is illustrated in the following example.
Example. Suppose that $k = 4$ and $q = 3$. If a block represents the
first bit, then its cell levels will transition from $(0,0,0,0)$ to
$(2,2,2,2)$ in the following order:

    (0000) → (1000) → (2000) → (2100) → (2200)
           → (2210) → (2220) → (2221) → (2222)

On the other hand, for a block that represents the second bit, the
corresponding cell-writing order is given by:

    (0000) → (0100) → (0200) → (0210) → (0220)
           → (0221) → (0222) → (1222) → (2222)

The cell-writing orders for blocks that represent the third and
fourth bits are given, respectively, by

    (0000) → (0010) → (0020) → (0021) → (0022)
           → (1022) → (2022) → (2122) → (2222)

    (0000) → (0001) → (0002) → (1002) → (2002)
           → (2102) → (2202) → (2212) → (2222)

Note that, unless a block is full, it is always possible to determine
which cell was written first and, consequently, which of the $k = 4$
bits this block represents. □
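The cell-writing orders in the example can be generated mechanically. The following Python sketch does so under the assumption that bits and cells are numbered from zero (so the "first bit" is `i = 0`); the function name `cell_writing_order` is a hypothetical helper, not notation from the paper.

```python
def cell_writing_order(k, q, i):
    """List the states of a k-cell block that represents bit i (0-based).

    Cells i, i+1, ..., i+k-1 (indices modulo k) are raised one after
    another, each from level 0 up to level q - 1.
    """
    block = [0] * k
    states = [tuple(block)]
    for step in range(k):
        cell = (i + step) % k          # cyclic order starting at cell i
        for _ in range(q - 1):
            block[cell] += 1
            states.append(tuple(block))
    return states


if __name__ == "__main__":
    for i in range(4):
        order = cell_writing_order(4, 3, i)
        print(" -> ".join("".join(map(str, s)) for s in order))
```

Each order contains $k(q-1) + 1$ states, one per cell-level transition plus the initial empty state.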
We now provide a precise specification of an $(n,k)_q$ flash code $C$
based upon this idea, in terms of a decoding map $\mathcal{D}_0$ and
an encoding map $\mathcal{E}_0$. In what follows, these maps are
described algorithmically, using (C-like) pseudo-code notation.



Decoding map $\mathcal{D}_0$: The input to this map is a cell-state
vector $x = (x_1|x_2|\cdots|x_m)$, partitioned into $m$ blocks of $k$
cells. The output is the information vector
$(v_0, v_1, \ldots, v_{k-1})$.

    (v_0, v_1, ..., v_{k-1}) = (0, 0, ..., 0);
    for (j = 1; j ≤ m; j = j + 1)
        if (active(x_j))
            { i = read-index(x_j); v_i = parity(x_j); }



Encoding map $\mathcal{E}_0$: The input to this map is a cell-state
vector $x = (x_1|x_2|\cdots|x_m)$, partitioned into $m$ blocks of $k$
cells, and an index $i$ of the bit that has changed. Its output is
either a cell-state vector $y = (y_1|y_2|\cdots|y_m)$ or the erasure
symbol $\mathbf{E}$.



To complete the specification of the flash code
$C(\mathcal{D}_0, \mathcal{E}_0)$, let us elaborate upon all the
functions used in the pseudo-code above. The functions active(x),
respectively empty(x), simply determine whether the given block is
active, respectively empty. The function parity(x) computes the
parity of $x$, defined in Section II. Note that the parity of a full
block is always zero (since $k(q-1)$ is even, by assumption). The
function read-index(x) computes the bit-index encoded in an active
block $x = (x_0, x_1, \ldots, x_{k-1})$. This can be done as follows.

Find all the zero cells in $x$. Note that these cells always form one
cyclically contiguous run, say $x_j, x_{j+1}, \ldots, x_{j+r}$ (where
the indices are modulo $k$). Then the index of the corresponding bit
is $i = j + r + 1 \pmod{k}$. If there are no zeros in $x$, there must
be exactly one cell, say $x_j$, whose level is strictly less than
$q-1$. In this case the bit-index is $i = j + 1 \pmod{k}$. The
function write(y) proceeds along similar lines. Find the single
cyclically contiguous run of zeros in $(y_0, y_1, \ldots, y_{k-1})$,
say $y_j, y_{j+1}, \ldots, y_{j+r}$. If $y_{j-1} < q-1$, increase
$y_{j-1}$ by one; otherwise set $y_j = 1$. If there are no zeros in
$y$, find the unique cell $y_j$ such that $y_j < q-1$ and increase
its level by one. Finally, the function write-new(i, y) simply sets
$y_i = 1$.
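The block-level functions described above can be sketched in Python as follows. This is an illustrative transcription under stated assumptions: blocks are Python lists indexed from zero, the block passed to `read_index` or `write` is active, and the helper names mirror the pseudo-code names but are otherwise hypothetical.

```python
def parity(block):
    """Parity of the block weight."""
    return sum(block) % 2


def read_index(block, q):
    """Bit index encoded in an active block (neither empty nor full)."""
    k = len(block)
    zeros = [c for c in range(k) if block[c] == 0]
    if zeros:
        # Last cell x_{j+r} of the cyclic zero run: its successor is nonzero.
        last = next(c for c in zeros if block[(c + 1) % k] != 0)
        return (last + 1) % k
    # No zeros: exactly one cell is still below level q - 1.
    j = next(c for c in range(k) if block[c] < q - 1)
    return (j + 1) % k


def write(block, q):
    """Use one more cell level of an active block, in the prescribed order."""
    k = len(block)
    zeros = [c for c in range(k) if block[c] == 0]
    if zeros:
        # First cell y_j of the cyclic zero run: its predecessor is nonzero.
        first = next(c for c in zeros if block[(c - 1) % k] != 0)
        prev = (first - 1) % k
        if block[prev] < q - 1:
            block[prev] += 1
        else:
            block[first] = 1
    else:
        j = next(c for c in range(k) if block[c] < q - 1)
        block[j] += 1


def write_new(i, block):
    """Start using an empty block for bit i."""
    block[i] = 1
```

For instance, with $q = 3$ the block `[0, 2, 2, 1]` decodes to bit index 1, and one call to `write` turns it into `[0, 2, 2, 2]`.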

Theorem 1. The write deficiency of the flash code
$C(\mathcal{D}_0, \mathcal{E}_0)$ described above is at most

    $(k-1)\bigl((k+1)(q-1) - 1\bigr) = O(qk^2)$    (3)

Proof. Note that at each instance, at most $k$ of the $m$ blocks are
active. The encoding map $\mathcal{E}_0(i,x)$ produces the symbol
$\mathbf{E}$ when there are no more empty blocks, and none of the
active blocks represents the $i$-th bit. In the worst case, this may
occur when there are $k-1$ active blocks, each using just one cell
level. This contributes $(k-1)(k(q-1) - 1)$ unused cell levels. In
addition, there are at most $k-1$ cells that are unused due to the
partition into $m = \lfloor n/k \rfloor$ blocks of exactly $k$ cells.
These contribute at most $(k-1)(q-1)$ unused cell levels. ∎
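The worst case in this proof can be checked numerically. The sketch below is a counting model only, written under the assumption that it suffices to track how many levels each active block has used (the cell-level details do not affect the count); `first_stage_writes` and `theorem1_bound` are hypothetical names.

```python
def theorem1_bound(k, q):
    """Right-hand side of (3)."""
    return (k - 1) * ((k + 1) * (q - 1) - 1)


def first_stage_writes(n, k, q, bit_sequence):
    """Number of writes accommodated before the erasure symbol appears.

    used[i] is the number of cell levels consumed in the active block
    that currently represents bit i; a block holds k(q-1) levels.
    """
    capacity = k * (q - 1)
    empty = n // k                 # blocks not yet touched
    used = {}
    writes = 0
    for i in bit_sequence:
        if i in used:
            used[i] += 1           # continue in the active block of bit i
        elif empty > 0:
            empty -= 1             # allocate an empty block to bit i
            used[i] = 1
        else:
            return writes          # erasure symbol E
        if used[i] == capacity:
            del used[i]            # block became full
        writes += 1
    return writes


if __name__ == "__main__":
    k, q, n = 4, 3, 23
    adversary = [0, 1, 2] + [3] * 100   # k-1 blocks used once, then bit 3 only
    t = first_stage_writes(n, k, q, adversary)
    print(n * (q - 1) - t, theorem1_bound(k, q))
```

For $k = 4$, $q = 3$ and $n = 23$ (so that $k - 1 = 3$ cells are left over by the partition), this write sequence leaves exactly $27$ levels unused, which matches (3).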



IV. Nearly Optimal Construction 
It is apparent from the proof of Theorem 1 that the deficiency of the
flash code $C(\mathcal{D}_0, \mathcal{E}_0)$, constructed in
Section III, is due primarily to the following: when writing stops,
there are still potentially numerous unused cell levels. The key idea
developed in this section is to continue writing after the encoding
map $\mathcal{E}_0$ produces the erasure symbol $\mathbf{E}$,
utilizing those cell levels that are left unused by $\mathcal{E}_0$.
Obviously, it is not possible to continue writing using the same
encoding and decoding maps. However, it may be possible to do so if,
at the point when $\mathcal{E}_0$ produces the erasure symbol
$\mathbf{E}$, we switch to a different encoding procedure, say
$\mathcal{E}_1$. In fact, this idea can be applied iteratively: once
$\mathcal{E}_1$ reaches its limit, we will transition to another
encoding map $\mathcal{E}_2$, then yet another map $\mathcal{E}_3$,
and so on.

Assuming that $k \equiv 0 \pmod{4}$, here is one way to continue
writing after the encoding map $\mathcal{E}_0$ has been exhausted.
When $\mathcal{E}_0$ produces the erasure symbol $\mathbf{E}$, we say
that the first stage of encoding is over and transition to the second
stage, as follows. First, we re-examine the cell-state vector
$x = (x_1|x_2|\cdots|x_m)$ and re-partition it into
$2m = 2\lfloor n/k \rfloor$ blocks of $k/2$ cells each. Most of these
smaller blocks will be already full, but we may find some $m_1$ of
them that are either empty or active (live). Observe that
$m_1 \le 2(k-1)$ since at the end of the first stage, there are at
most $k-1$ active blocks of $k$ cells, and each of them produces at
most two live (non-full) blocks of $k/2$ cells.

If $m_1 \ge k$, we can continue writing as follows. Once again, each
of the $m_1$ blocks will represent exactly one bit; as before, the
value of this bit is determined by the parity of the block. As part
of the transition from the first stage to the second stage, we record
the current information vector $(v_0, v_1, \ldots, v_{k-1})$ in the
first $k$ of the $m_1$ live blocks, say $x_1, x_2, \ldots, x_k$. To
this end, whenever parity$(x_i) \ne v_{i-1}$, we increase the level
of one of the cells in $x_i$ by one; otherwise, we leave $x_i$ as is.

Since the blocks now have $k/2$ cells rather than $k$ cells, it is no
longer possible to encode in each block which of the $k$ information
bits it represents. Therefore, we set aside for this purpose
$2(k-1)\lceil \log_q(k+2) \rceil$ index cells (that are not used
during the first stage). These cells are partitioned into $2(k-1)$
blocks of $\mu = \lceil \log_q(k+2) \rceil$ cells each, which we call
index blocks. Henceforth, it will be convenient to refer to the
blocks of $k/2$ cells as parity blocks, in order to distinguish them
from the index blocks. Initially, the first $k$ index blocks
$u_1, u_2, \ldots, u_k$ are set so that $u_i = i$ (in the base-$q$
number system), which reflects the fact that the information bits
$v_0, v_1, \ldots, v_{k-1}$ are stored (in that order) in the first
$k$ live parity blocks. The next $m_1 - k$ index blocks are set to
$(0, 0, \ldots, 0)$, thereby indicating that the corresponding (live)
parity blocks are available to store information bits. The last
$2(k-1) - m_1$ index blocks are set to $(q-1, q-1, \ldots, q-1)$ to
indicate that the corresponding parity blocks are full (in fact,
nonexistent). Finally, it is possible that in the process of
enforcing parity$(x_i) = v_{i-1}$ for the first $k$ live parity
blocks, some of these blocks become full (this happens iff
$\mathrm{wt}(x_i) = (k/2)(q-1) - 1$ and $v_{i-1} = 0$ at the end of
the first stage, since $k/2$ is even by assumption). To account for
this fact, we set the corresponding index blocks to
$(q-1, q-1, \ldots, q-1)$. This completes the transition from the
first stage to the second stage, which is invoked when the encoding
map $\mathcal{E}_0$ produces the erasure symbol $\mathbf{E}$.

Let us now summarize the foregoing discussion by giving 
a concise algorithmic description of the transition procedure. 



Pseudo-code of the first-stage encoding map $\mathcal{E}_0$
(Section III):

    (y_1|y_2|...|y_m) = (x_1|x_2|...|x_m);
    for (j = 1; j ≤ m; j = j + 1)
        if (active(x_j) ∧ (read-index(x_j) == i))
            { write(y_j); break; }
    if (j == m + 1)    // active block not found
        for (j = 1; j ≤ m; j = j + 1)
            if (empty(x_j)) { write-new(i, y_j); break; }
    if (j == m + 1)    // no empty blocks remain
        return E;



Pseudo-code of the second-stage encoding map $\mathcal{E}_1$:

    (y_1|y_2|...|y_{2m}) = (x_1|x_2|...|x_{2m});
    (u'_1|u'_2|...|u'_{2k-2}) = (u_1|u_2|...|u_{2k-2});
    for (ℓ = j = 1; j ≤ 2m; j = j + 1)
    {
        if (full(x_j)) continue;
        while (full(u_ℓ)) ℓ = ℓ + 1;
        if (u_ℓ == i + 1)
        {
            increment(y_j);
            if (full(y_j)) u'_ℓ = q^μ − 1;
            break;
        }
        else ℓ = ℓ + 1;
    }
    if (j == 2m + 1)    // active block not found
        for (ℓ = j = 1; j ≤ 2m; j = j + 1)
        {
            if (full(x_j)) continue;
            while (full(u_ℓ)) ℓ = ℓ + 1;
            if (u_ℓ == 0)
            {
                u'_ℓ = i + 1;
                if (parity(x_j) ≠ v_i) increment(y_j);
                if (full(y_j)) u'_ℓ = q^μ − 1;
                break;
            }
            else ℓ = ℓ + 1;
        }
    if (j == 2m + 1)    // no more available live blocks
        return E;



Transition procedure $\mathcal{T}_1$: Partition the memory into
$2\lfloor n/k \rfloor$ parity blocks of $k/2$ cells, and identify the
$m_1 \le 2(k-1)$ parity blocks $x_1, x_2, \ldots, x_{m_1}$ that are
not full. If $m_1 < k$, output the erasure symbol $\mathbf{E}$ and
terminate. Otherwise, set the $2(k-1)$ index blocks
$u_1, u_2, \ldots, u_{2k-2}$ as follows:

    $u_i = \begin{cases} i & \text{for } i = 1, 2, \ldots, k \\ 0 & \text{for } i = k+1, k+2, \ldots, m_1 \\ q^{\mu} - 1 & \text{for } i = m_1+1, m_1+2, \ldots, 2k-2 \end{cases}$    (4)

where $\mu = \lceil \log_q(k+2) \rceil$ is the number of cells in
each index block, then record the information vector
$(v_0, v_1, \ldots, v_{k-1})$ in the first $k$ live parity blocks
$x_1, x_2, \ldots, x_k$, as follows:

    for (i = 1; i ≤ k; i = i + 1)
        if (parity(x_i) ≠ v_{i-1})
            { increment(x_i); if (full(x_i)) u_i = q^μ − 1; }
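A Python sketch of this transition procedure is given below. It makes two simplifying assumptions that are not in the paper: each index block is modeled as a single integer (its base-$q$ value) rather than as $\mu$ physical cells, and the value $q^{\mu} - 1$ marks a full index block. The names `transition`, `index_block_size` and `ERASE` are hypothetical.

```python
ERASE = "E"  # hypothetical stand-in for the erasure symbol


def index_block_size(k, q):
    """Smallest mu with q**mu >= k + 2, i.e. ceil(log_q(k + 2))."""
    mu = 1
    while q ** mu < k + 2:
        mu += 1
    return mu


def transition(cells, k, q, v):
    """Re-partition into blocks of k/2 cells and record the bits v.

    Returns (parity_blocks, u) where u lists the 2(k-1) index values,
    or ERASE if fewer than k live parity blocks remain.
    """
    half = k // 2
    full_weight = half * (q - 1)
    num_blocks = 2 * (len(cells) // k)
    blocks = [list(cells[j * half:(j + 1) * half]) for j in range(num_blocks)]
    live = [b for b in blocks if sum(b) < full_weight]
    m1 = len(live)
    if m1 < k:
        return ERASE
    assert m1 <= 2 * (k - 1), "state is not an end-of-first-stage state"
    FULL = q ** index_block_size(k, q) - 1
    # Equation (4): i for the first k, 0 for available, FULL for the rest.
    u = list(range(1, k + 1)) + [0] * (m1 - k) + [FULL] * (2 * (k - 1) - m1)
    for i in range(k):
        if sum(live[i]) % 2 != v[i]:
            c = next(c for c, lvl in enumerate(live[i]) if lvl < q - 1)
            live[i][c] += 1                      # enforce the parity
            if sum(live[i]) == full_weight:
                u[i] = FULL                      # block filled up
    return blocks, u
```

As a usage example, for $k = 4$, $q = 3$ and the end-of-stage state `[1,0,0,0, 0,1,0,0, 0,0,1,0] + [2]*8` with bits `[1,1,1,0]`, the sketch yields six live parity blocks and index values `[1, 2, 3, 4, 0, 0]`.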



The function full(x) determines whether the given block $x$ (which
could be a parity block or an index block) is full. The function
increment(x) increases by one the level of a cell (it does not matter
which) in the given live block.

During second-stage encoding and decoding, we will need to figure out
for each active parity block $x$ which of the $k$ information bits it
represents. To this end, we will have to find and read the index
block $u$ that corresponds to $x$. How exactly is the correspondence
between parity blocks and index blocks established? Note that, upon
the completion of the transition procedure $\mathcal{T}_1$, there is
the same number of live parity blocks and live index blocks;
moreover, the $j$-th live index block corresponds to the $j$-th live
parity block, for all $j$. The encoding procedure will make sure that
this correspondence is preserved throughout the second stage:
whenever a parity block becomes full, it will make the corresponding
index block full as well.

We are now ready to present the encoding and decoding maps which are,
again, specified in C-like pseudo-code notation.

Decoding map $\mathcal{D}_1$: The input to this map is a cell-state
vector $x = (x_1|x_2|\cdots|x_{2m} \,\|\, u_1|u_2|\cdots|u_{2k-2})$,
partitioned into $2m$ parity blocks, of $k/2$ cells each, and
$2(k-1)$ index blocks. The output is the information vector
$(v_0, v_1, \ldots, v_{k-1})$.

    (v_0, v_1, ..., v_{k-1}) = (0, 0, ..., 0);
    for (ℓ = j = 1; j ≤ 2m; j = j + 1)
    {
        if (full(x_j)) continue;        // skip full blocks
        while (full(u_ℓ)) ℓ = ℓ + 1;    // skip full blocks
        i = u_ℓ; ℓ = ℓ + 1;
        if (i ≠ 0) v_{i-1} = parity(x_j);
    }



Given an index $i$ of the bit that has changed, the encoding map
$\mathcal{E}_1$ first tries to find an active parity block $x$ that
represents the $i$-th information bit. If such a block is found, it
is incremented and checked for getting full (in which case the
corresponding index block is set to $q^{\mu} - 1$). If not, another
live parity block is allocated to represent the $i$-th information
bit. If no more live parity blocks are available, the erasure symbol
$\mathbf{E}$ is returned.

Encoding map $\mathcal{E}_1$: The input to this map is a cell-state
vector $x = (x_1|x_2|\cdots|x_{2m} \,\|\, u_1|u_2|\cdots|u_{2k-2})$,
partitioned into $2m$ parity blocks and $2(k-1)$ index blocks, and an
index $i$ of the information bit that changed. Its output is either a
cell-state vector
$y = (y_1|y_2|\cdots|y_{2m} \,\|\, u'_1|u'_2|\cdots|u'_{2k-2})$ or
the symbol $\mathbf{E}$.



Note that when the second encoding stage terminates, there are at
most $k-1$ parity blocks that are not full, comprising at most
$k(k-1)/2$ cells (at most $k(k-1)(q-1)/2$ cell-levels).
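The second-stage maps can be prototyped in a few lines of Python. The sketch below rests on the same modeling assumptions as before, which are not part of the paper: index blocks are stored as integers in a list `u`, the value `FULL` $= q^{\mu} - 1$ marks a full index block, and the new value of the changed bit is passed explicitly as `new_value`. All function names are hypothetical.

```python
ERASE = "E"  # hypothetical stand-in for the erasure symbol


def live_pairs(blocks, u, full_weight, FULL):
    """Pair the j-th live parity block with the j-th live index block."""
    l = 0
    for b in blocks:
        if sum(b) == full_weight:
            continue                  # skip full parity blocks
        while u[l] == FULL:
            l += 1                    # skip full index blocks
        yield b, l
        l += 1


def increment(block, q):
    """Raise the level of one (arbitrary) cell that is below q - 1."""
    c = next(c for c, level in enumerate(block) if level < q - 1)
    block[c] += 1


def decode1(blocks, u, k, q, FULL):
    full_weight = len(blocks[0]) * (q - 1)
    v = [0] * k
    for b, l in live_pairs(blocks, u, full_weight, FULL):
        if u[l] != 0:                 # index 0 means "available"
            v[u[l] - 1] = sum(b) % 2
    return v


def encode1(blocks, u, i, new_value, q, FULL):
    """Update blocks and u in place; return True, or ERASE on failure."""
    full_weight = len(blocks[0]) * (q - 1)
    pairs = list(live_pairs(blocks, u, full_weight, FULL))
    for b, l in pairs:                # an active block already holds bit i
        if u[l] == i + 1:
            increment(b, q)
            if sum(b) == full_weight:
                u[l] = FULL
            return True
    for b, l in pairs:                # otherwise allocate an available block
        if u[l] == 0:
            u[l] = i + 1
            if sum(b) % 2 != new_value:
                increment(b, q)
            if sum(b) == full_weight:
                u[l] = FULL
            return True
    return ERASE
```

Starting from a hypothetical post-transition state with $k = 4$, $q = 3$, six live parity blocks and index values `[1, 2, 3, 4, 0, 0]`, writing the bits cyclically keeps `decode1` in agreement with the tracked bit values until `encode1` returns `ERASE`, at which point at most $k - 1$ parity blocks are still live.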

Once the maps $\mathcal{D}_1$ and $\mathcal{E}_1$ are understood, it
becomes clear that the same approach can be applied iteratively. The
resulting flash code $C^*$ will proceed, sequentially, through $s$
different encoding stages
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_{s-1}$, where
$s = \lceil \log_2 k \rceil$. In describing this code, we shall
assume for the sake of simplicity that $k$ is a power of two, that is
$k = 2^s$. If not, the same code can be used to store $2^s > k$
information bits, of which the last $2^s - k$ are set to zero. Note
that this will not change the order of the resulting write
deficiency.

To accommodate the encoding maps
$\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_{s-1}$, we set
aside for each map a batch of $2(k-1)$ index blocks, with each index
block consisting of $\mu = \lceil \log_q(k+2) \rceil$ cells. The
transition procedure $\mathcal{T}_r$ which bridges between the
encoding maps $\mathcal{E}_{r-1}$ and $\mathcal{E}_r$ (for some
$r \in \{2, 3, \ldots, s-1\}$) is identical to the transition
procedure $\mathcal{T}_1$, except for the following differences:

D1. The $r$-th batch of index blocks is used; and
D2. The parity blocks consist of $k/2^r$ cells each.

In addition to D1 and D2, the decoding/encoding maps $\mathcal{D}_r$
and $\mathcal{E}_r$ differ from $\mathcal{D}_1$ and $\mathcal{E}_1$
in that "$2m$" should be replaced by "$2^r m$" throughout, where $m$
stands for $\lfloor n/k \rfloor$ as before. There are no other
differences.

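Theorem 2. For $s = \lceil \log_2 k \rceil$, the write deficiency of
the flash code $C^*$ defined by the sequence of decoding/encoding
maps $\mathcal{D}_0, \mathcal{D}_1, \ldots, \mathcal{D}_{s-1}$ and
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_{s-1}$ is
$O(qk \log^2 k / \log q)$.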

Proof. We consider the worst-case scenario for the number of cell
levels that are either unused or "wasted" in the overall encoding
procedure. As before, there are at most $k-1$ cells that are unused
due to the partition into $\lfloor n/k \rfloor$ blocks, of exactly
$k$ cells each, at the very first encoding stage. These cells
contribute at most $(q-1)(k-1)$ unused cell levels. The index blocks
for the $s-1$ encoding maps
$\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_{s-1}$ contain
$2(k-1)(s-1)\mu$ cells altogether, thereby wasting at most

    $2(q-1)(k-1)(s-1)\lceil \log_q(k+2) \rceil = O\!\left(\frac{qk \log^2 k}{\log q}\right)$    (5)

cell levels. In each of the $s-1$ transition procedures, the
situation parity$(x_i) \ne v_{i-1}$ can occur at most $k$ times, and
each time it occurs a single cell level is wasted. Finally, as in
Theorem 1, when the encoding process
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_{s-1}$ terminates
there are at most $k-1$ parity blocks that are not full and, in the
worst case, each of them uses just one cell level. However, now these
parity blocks contain only $\lceil k/2^{s-1} \rceil = 2$ cells each,
and thus contribute at most $(k-1)(2q-3)$ unused cell levels. Putting
all of this together, we find that at most

    $(q-1)(k-1)\bigl(2(s-1)\lceil \log_q(k+2) \rceil + 3\bigr) + k(s-1)$    (6)

cell levels are wasted or left unused. Clearly, this expression is
dominated by (5), and thus bounded by $O(qk \log^2 k / \log q)$. ∎
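The accounting in this proof can be verified term by term. The following Python sketch, which assumes $k = 2^s$ as in the text and uses the hypothetical helper names `ceil_log`, `wasted_terms` and `bound_6`, evaluates the four contributions and compares their sum with expression (6).

```python
def ceil_log(base, x):
    """Smallest integer e >= 0 with base**e >= x."""
    e, p = 0, 1
    while p < x:
        p *= base
        e += 1
    return e


def wasted_terms(k, q):
    """The four contributions counted in the proof, for k a power of two."""
    s = k.bit_length() - 1                   # k = 2**s
    mu = ceil_log(q, k + 2)                  # cells per index block
    return {
        "partition": (q - 1) * (k - 1),      # leftover cells of the partition
        "index": 2 * (q - 1) * (k - 1) * (s - 1) * mu,   # expression (5)
        "transitions": k * (s - 1),          # parity fixes in T_1..T_{s-1}
        "final": (k - 1) * (2 * q - 3),      # k-1 two-cell blocks, one level used
    }


def bound_6(k, q):
    """Expression (6)."""
    s = k.bit_length() - 1
    mu = ceil_log(q, k + 2)
    return (q - 1) * (k - 1) * (2 * (s - 1) * mu + 3) + k * (s - 1)


if __name__ == "__main__":
    terms = wasted_terms(8, 4)
    print(terms, sum(terms.values()), bound_6(8, 4))
```

For $k = 8$ and $q = 4$ the four terms sum to $240$ while (6) evaluates to $247$; in general the slack is exactly $k - 1$, because $(k-1)(2q-3)$ is bounded by $2(q-1)(k-1)$ in the last step.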

For large $q$, the upper bound of $O(qk \log^2 k / \log q)$ on the
deficiency of our scheme can be improved by using a more efficient
"packaging" of index blocks in the flash memory. As before, we
allocate a batch of $2(k-1)$ index blocks to each encoding stage
except $\mathcal{E}_0$. But now, every index block will occupy
$\mu' = \lceil \log_2(k+2) \rceil$ cells rather than
$\mu = \lceil \log_q(k+2) \rceil$ cells, and the indices will be
written in binary rather than in the base-$q$ number system. This
allows index blocks that correspond to successive encoding stages to
be "stacked on top of each other" in the same memory cells.
Specifically, the encoding stage $\mathcal{E}_1$ will use only cell
levels 0 and 1 to record the indices in its index blocks. Once this
stage is over, the index information recorded during $\mathcal{T}_1$
and $\mathcal{E}_1$ is no longer relevant, and the level of all the
$2(k-1)\mu'$ cells in the $2(k-1)$ index blocks can be raised to 1.
Thereafter, provided $q \ge 3$, the transition procedure
$\mathcal{T}_2$ and the encoding map $\mathcal{E}_2$ can use cell
levels 1 and 2 to record the relevant index information in the same
memory cells. Proceeding in this manner, we can accommodate up to
$q-1$ batches of index blocks in $2(k-1)\mu'$ memory cells. We shall
refer to this indexing scheme as stacked binary indexing and denote
the resulting flash code by $C'$.

Theorem 3. The write deficiency of the flash code $C'$ defined by the
sequence of decoding/encoding maps
$\mathcal{D}_0, \mathcal{D}_1, \ldots, \mathcal{D}_{s-1}$ and
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_{s-1}$ that use
stacked binary indexing is at most $O(qk \log k)$ if
$q \ge \log_2 k$, and at most $O(k \log^2 k)$ otherwise.

Proof. With stacked binary indexing, the number of cell levels wasted
in all the $2(k-1)(s-1)$ index blocks is at most

    $2(q-1)(k-1)\left\lceil \frac{s-1}{q-1} \right\rceil \lceil \log_2(k+2) \rceil$    (7)

Although for most values of $k$ and $q$ this is strictly less
than (5), all the other terms in (6) are still dominated by (7). ∎
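To see how (7) compares with (5) for concrete parameters, one can evaluate both expressions directly. The sketch below assumes $k = 2^s$; the function names `waste_base_q` and `waste_stacked` are hypothetical.

```python
def ceil_log(base, x):
    """Smallest integer e >= 0 with base**e >= x."""
    e, p = 0, 1
    while p < x:
        p *= base
        e += 1
    return e


def waste_base_q(k, q):
    """Expression (5): index blocks written in base q, one batch per stage."""
    s = k.bit_length() - 1
    return 2 * (q - 1) * (k - 1) * (s - 1) * ceil_log(q, k + 2)


def waste_stacked(k, q):
    """Expression (7): binary indices, up to q - 1 batches stacked per cell."""
    s = k.bit_length() - 1
    groups = -(-(s - 1) // (q - 1))          # ceil((s - 1) / (q - 1))
    return 2 * (q - 1) * (k - 1) * groups * ceil_log(2, k + 2)


if __name__ == "__main__":
    for k, q in [(1024, 8), (1024, 2), (4, 16)]:
        print(k, q, waste_base_q(k, q), waste_stacked(k, q))
```

For $k = 1024$ and $q = 8$ the stacked scheme wastes $315084$ levels against $515592$ for base-$q$ indexing; the two coincide for $q = 2$, and for very small $k$ with large $q$ (for example $k = 4$, $q = 16$) expression (7) is the larger one, which is why the text says "for most values".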

Remark. If we need to store $k$ symbols, rather than bits, over an
alphabet of size $\ell > 2$, the same flash code can still be used,
with an appropriate interface. With the linear womcode of [7], the
$\ell$-ary symbols can be represented using $\ell - 1$ bits in such a
way that any symbol change corresponds to a single bit transition.
The flash code $C'$ can be now applied as is, and the resulting write
deficiency is $O(\max\{q, \log_2 k\ell\}\, k\ell \log k\ell)$.



V. Flash Codes of Constant Rate 
All of our results so far pertain to the case where $n \ge k^2$. In
this section, we briefly examine the situation where both $k$ and $n$
are large, while $k/n = R$ for some constant $R < 1$. Observe that
write deficiency $\delta(C) = n(q-1) - t$ is not an appropriate
figure of merit in this situation: a trivial code that guarantees
$t = 0$ writes achieves write deficiency $n(q-1) = k(q-1)/R$, which
is within a constant factor $2/R$ from the lower bound (1). Thus we
will state our results in terms of the guaranteed number of writes
$t$ rather than the write deficiency $\delta(C)$.

If $q = 2$, we can easily guarantee $\Omega(n/\log k)$ writes as
follows: partition the $n$ cells into blocks of size
$\lceil \log_2 k \rceil$ and each time an information bit changes,
record its index in the next available block. For $q > 2$, the same
method guarantees about
$\lfloor n/\log_q k \rfloor = \Omega(n \log q / \log k)$ writes, but
we can do better.

Let us partition the $n$ cells into two groups: the index group
consisting of $n - k$ cells and the parity group consisting of $k$
cells. The index group is then subdivided into
$m = \lfloor (n-k)/s \rfloor$ blocks, each consisting of
$s = \lceil \log_2 k \rceil$ cells. The writing proceeds in $q - 1$
phases. During the first phase, every time an information bit
changes, its index is recorded in binary (using cell levels 0 and 1)
in the next available index block. After $m$ writes, the first phase
is over. We then copy the $k$ information bits into the $k$ cells of
the parity group, and raise the level of all cells in the index group
to 1. The second phase can now proceed using cell levels 1 and 2, and
recording changes in information bits relative to the values stored
in the parity group. At the end of the second phase, the current
values of the $k$ bits are recorded in the parity cells using levels
1 and 2. And so on. This simple coding scheme achieves

    $m(q-1) = \frac{n(q-1)(1-R)}{\log_2 k} = \Omega\!\left(\frac{nq}{\log k}\right)$    (8)

writes (where the middle expression ignores ceilings/floors by
assuming that $k$ is a power of two and that $n - k$ is divisible by
$\log_2 k$). If $q$ is odd and $R \ge 0.415$, we can do a little
better by using the ternary number system (cell levels 0, 1, 2) in
both the index group and the parity group. In this case, the size of
the parity group is $\lceil k/\log_2 3 \rceil$ cells and $1 - R$
in (8) can be replaced by $(\log_2 3 - R)/2$. Finally, for all
$R \ge 0.755$ and $q - 1$ divisible by three, the quaternary alphabet
is optimal, leading to a factor of $(2 - R)/3$ rather than $1 - R$
in (8).
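The thresholds $0.415$ and $0.755$ follow from comparing the three factors directly: $(\log_2 3 - R)/2$ exceeds $1 - R$ once $R > 2 - \log_2 3$, and $(2 - R)/3$ exceeds $(\log_2 3 - R)/2$ once $R > 3\log_2 3 - 4$. The Python sketch below checks this numerically; it ignores the divisibility conditions on $q$, and the names `rate_factors` and `best_alphabet` are hypothetical.

```python
import math


def rate_factors(R):
    """Factors multiplying n(q-1)/log2(k) in (8) for the three alphabets."""
    return {
        "binary": 1 - R,
        "ternary": (math.log2(3) - R) / 2,
        "quaternary": (2 - R) / 3,
    }


def best_alphabet(R):
    """Alphabet with the largest factor at rate R (divisibility of q ignored)."""
    factors = rate_factors(R)
    return max(factors, key=factors.get)


if __name__ == "__main__":
    print("ternary beats binary for R >", 2 - math.log2(3))
    print("quaternary beats ternary for R >", 3 * math.log2(3) - 4)
    for R in (0.3, 0.5, 0.8):
        print(R, best_alphabet(R))
```

Under these assumptions the binary alphabet is best at $R = 0.3$, the ternary alphabet at $R = 0.5$, and the quaternary alphabet at $R = 0.8$.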